learning latent variable model
Tensor decompositions for learning latent variable models
Anandkumar, Anima, Ge, Rong, Hsu, Daniel, Kakade, Sham M., Telgarsky, Matus
This work considers a computationally and statistically efficient parameter estimation method for a wide class of latent variable models---including Gaussian mixture models, hidden Markov models, and latent Dirichlet allocation---which exploits a certain tensor structure in their low-order observable moments (typically, of second- and third-order). Specifically, parameter estimation is reduced to the problem of extracting a certain (orthogonal) decomposition of a symmetric tensor derived from the moments; this decomposition can be viewed as a natural generalization of the singular value decomposition for matrices. Although tensor decompositions are generally intractable to compute, the decomposition of these specially structured tensors can be efficiently obtained by a variety of approaches, including power iterations and maximization approaches (similar to the case of matrices). A detailed analysis of a robust tensor power method is provided, establishing an analogue of Wedin's perturbation theorem for the singular vectors of matrices. This implies a robust and computationally tractable estimation approach for several popular latent variable models.
Principled Approaches for Learning Latent Variable Models
In any learning task, it is natural to incorporate latent or hidden variables which are not directly observed. For instance, in a social network, we can observe interactions among the actors, but not their hidden interests/intents, in gene networks, we can measure gene expression levels but not the detailed regulatory mechanisms, and so on. I will present a broad framework for unsupervised learning of latent variable models, addressing both statistical and computational concerns. We show that higher order relationships among observed variables have a low rank representation under natural statistical constraints such as conditional-independence relationships. These findings have implications in a number of settings such as finding hidden communities in networks, discovering topics in text documents and learning about gene regulation in computational biology.